282 research outputs found

    Fusing Data with Correlations

    Many applications rely on Web data and extraction systems to accomplish knowledge-driven tasks. Web information is not curated, so many sources provide inaccurate or conflicting information. Moreover, extraction systems introduce additional noise into the data. We wish to automatically distinguish correct data from erroneous data in order to create a cleaner set of integrated data. Previous work has shown that a naive voting strategy, which trusts data provided by the majority or by at least a certain number of sources, may not work well in the presence of copying between the sources. However, correlation between sources can be much broader than copying: sources may provide data from complementary domains (negative correlation), extractors may focus on different types of information (negative correlation), and extractors may apply common rules in extraction (positive correlation, without copying). In this paper we present novel techniques for modeling correlations between sources and applying them in truth finding. Comment: Sigmod'201
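
As a rough illustration of why naive voting can fail under copying, the following minimal sketch counts votes per claimed value and then repeats the count after discarding known copiers. It is not the technique presented in the paper; the source names, the claimed values, and the `copies_from` map are all hypothetical.

```python
from collections import Counter

def majority_vote(claims):
    """Return (value, votes) for the most frequently claimed value.

    claims: dict mapping a source name to the value it claims for one data item.
    """
    counts = Counter(claims.values())
    return counts.most_common(1)[0]

def vote_ignoring_copiers(claims, copies_from):
    """Vote again, counting only sources that are not known copiers.

    copies_from: dict mapping a copier's name to the source it copies (assumed known).
    """
    independent = {s: v for s, v in claims.items() if s not in copies_from}
    return majority_vote(independent)

# Hypothetical data: s3 and s4 copy the wrong value from s2.
claims = {"s1": "Paris", "s5": "Paris", "s2": "Lyon", "s3": "Lyon", "s4": "Lyon"}
copies_from = {"s3": "s2", "s4": "s2"}

naive_result = majority_vote(claims)
adjusted_result = vote_ignoring_copiers(claims, copies_from)
```

In this toy case the naive vote is swayed by the two copiers, while the adjusted vote is not. Handling the broader positive and negative correlations described in the abstract would need more than dropping copiers.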

    ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

    Entity resolution (ER), an important and common data cleaning problem, is about detecting duplicate data representations of the same external entities and merging them into single representations. Relatively recently, declarative rules called matching dependencies (MDs) have been proposed for specifying similarity conditions under which attribute values in database records are merged. In this work we show the process and the benefits of integrating three components of ER: (a) classifiers for duplicate/non-duplicate record pairs built using machine learning (ML) techniques; (b) MDs for supporting both the blocking phase of ML and the merge itself; and (c) the use of the declarative language LogiQL (an extended form of Datalog supported by the LogicBlox platform) for data processing and for the specification and enforcement of MDs. Comment: To appear in Proc. SUM, 201
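
The general shape of a blocking step followed by pairwise duplicate detection can be sketched in plain Python. This is only an illustration under stated assumptions, not ERBlox or LogiQL: the records, the blocking key (`city`), and the threshold are hypothetical, and a simple string-similarity test from the standard library stands in for the trained ML classifier.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records; field names are assumptions for this sketch.
records = [
    {"id": 1, "name": "John Smith", "city": "Ottawa"},
    {"id": 2, "name": "Jon Smith", "city": "Ottawa"},
    {"id": 3, "name": "Mary Jones", "city": "Toronto"},
]

def block_key(record):
    """Blocking key: only records sharing this key are compared."""
    return record["city"].lower()

def candidate_pairs(records):
    """Group records into blocks and yield every pair within each block."""
    blocks = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)
    for group in blocks.values():
        yield from combinations(group, 2)

def is_duplicate(a, b, threshold=0.85):
    """Stand-in for a classifier: name similarity at or above a threshold."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold

duplicates = [(a["id"], b["id"]) for a, b in candidate_pairs(records) if is_duplicate(a, b)]
```

Blocking keeps the number of compared pairs small; the pairs flagged as duplicates would then be passed to a merge step.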

    Protomers of Benzocaine: Solvent and Permittivity Dependence

    The immediate environment of a molecule can have a profound influence on its properties. Benzocaine, the ethyl ester of para-aminobenzoic acid, which is used as a local anesthetic (LA), is found to adopt at least two populations of distinct structures in its protonated form in the gas phase, and their relative intensities depend strongly on the properties of the solvent used in the electrospray ionization (ESI) process. Here we combine IR-vibrational spectroscopy with ion mobility-mass spectrometry (IM-MS) to yield gas-phase IR spectra of simultaneously m/z and drift-time resolved species of benzocaine. The results allow for an unambiguous identification of two protomeric species: the N- and O-protonated forms. Density functional theory (DFT) calculations link these structures to the most stable solution and gas-phase structures, respectively, with the electric properties of the surrounding medium being the main determinant of the preferred protonation site. The fact that the N-protonated form of benzocaine can be found in the gas phase is due to kinetic trapping of the solution-phase structure during transfer into the experimental setup. These observations confirm earlier studies on similar molecules where N- and O-protonation have been suggested.

    Spectroscopic Evidence for an Oxazolone Structure in Anionic b-Type Peptide Fragments

    Infrared spectra of anionic b-type fragments generated by collision-induced dissociation (CID) from deprotonated peptides are reported. Spectra of the b2 fragments of deprotonated AlaAlaAla and AlaTyrAla have been recorded over the 800–1800 cm⁻¹ spectral range by multiple-photon dissociation (MPD) spectroscopy using an FTICR mass spectrometer in combination with the free electron laser FELIX. Structural characterization of the b-type fragments is accomplished by comparison with density functional theory calculated spectra at the B3LYP/6-31++G(d,p) level for different isomeric structures. Although diketopiperazine structures represent the energetically lowest isomers, the IR spectra suggest an oxazolone structure for the b2 fragments of both peptides. Deprotonation is shown to occur on the oxazolone α-carbon, which leads to a conjugated structure in which the negative charge is practically delocalized over the entire oxazolone ring, providing enhanced gas-phase stability.

    Insight into the Stability of Cross-β Amyloid Fibril from VEALYL Short Peptide with Molecular Dynamics Simulation

    Amyloid fibrils are found in many fatal neurodegenerative diseases such as Alzheimer's disease, Parkinson's disease, type II diabetes, and prion disease. The VEALYL short peptide from insulin has been confirmed to aggregate into amyloid-like fibrils. However, the aggregation mechanism of amyloid fibrils is poorly understood. Here, we used molecular dynamics simulation to analyse the stability of the VEALYL hexamer. The statistical results indicate that hydrophobic residues play key roles in stabilizing the VEALYL hexamer. Single point and two linkage mutants confirmed that Val1, Leu4, and Tyr5 of VEALYL are key residues. The consistency of the results for the VEALYL oligomer suggests that the intermediate states might be a trimer (3-0) and a pentamer (3-2). These results offer insight into the aggregation mechanism of amyloid fibrils. These methods can be used to study the stability of amyloid fibrils from other short peptides.

    Substituent Effects in the Noncovalent Bonding of SO2 to Molecules Containing a Carbonyl Group. The Dominating Role of the Chalcogen Bond

    The SO2 molecule is paired with a number of carbonyl-containing molecules, and the properties of the resulting complexes are calculated by high-level ab initio theory. The global minimum of each pair is held together primarily by a S···O chalcogen bond wherein the lone pairs of the carbonyl O transfer charge to the π* antibonding SO orbital, supplemented by smaller contributions from weak CH···O H-bonds. The binding energies vary between 4.2 and 8.6 kcal/mol, competitive with even some of the stronger noncovalent forces such as H-bonds and halogen bonds. The geometrical arrangement places the carbonyl O atom above the plane of the SO2 molecule, consistent with the disposition of the molecular electrostatic potentials of the two monomers. This S···O bond differs from the more commonly observed chalcogen bond in both geometry and origin. Substituents exert their influence via inductive effects that change the availability of the carbonyl O lone pairs as well as the intensity of the negative electrostatic potential surrounding this atom.